
204 research outputs found

On the Challenges and Perspectives of Foundation Models for Medical Image Analysis

Authors: Metaxas Dimitris; Zhang Shaoting
Publication date: 09/06/2023

This article discusses the opportunities, applications and future directions of large-scale pre-trained models, i.e., foundation models, for analyzing medical images. Medical foundation models have immense potential in solving a wide range of downstream tasks, as they can help to accelerate the development of accurate and robust models, reduce the amount of labeled data required, and preserve the privacy and confidentiality of patient data. Specifically, we illustrate the "spectrum" of medical foundation models, ranging from general vision models and modality-specific models to organ/task-specific models, highlighting their challenges, opportunities and applications. We also discuss how foundation models can be leveraged in downstream medical tasks to enhance the accuracy and efficiency of medical image analysis, leading to more precise diagnosis and treatment decisions.

arXiv.org e-Print Archive

Multispectral Deep Neural Networks for Pedestrian Detection

Authors: Liu Jingjing; Metaxas Dimitris N.; Wang Shu; Zhang Shaoting
Publication date: 01/01/2016

Multispectral pedestrian detection is essential for around-the-clock applications, e.g., surveillance and autonomous driving. We analyze Faster R-CNN in depth for the multispectral pedestrian detection task and then model it as a convolutional network (ConvNet) fusion problem. Further, we discover that ConvNet-based pedestrian detectors trained separately on color or thermal images provide complementary information in discriminating human instances. Thus there is a large potential to improve pedestrian detection by using color and thermal images in DNNs simultaneously. We carefully design four ConvNet fusion architectures that integrate two-branch ConvNets at different DNN stages, all of which yield better performance compared with the baseline detector. Our experimental results on the KAIST pedestrian benchmark show that the Halfway Fusion model, which performs fusion on the middle-level convolutional features, outperforms the baseline method by 11% and yields a missing rate 3.5% lower than the other proposed architectures. Comment: 13 pages, 8 figures, BMVC 2016 ora

arXiv.org e-Print Archive

Crossref

Predicting Fracture Energies and Crack-Tip Fields of Soft Tough Materials

Authors: Lin Shaoting; Yuk Hyunwoo; Zhang Teng; Zhao Xuanhe
Publication date: 13/06/2015

Soft materials, including elastomers and gels, are pervasive in biological systems and technological applications. Whereas it is known that intrinsic fracture energies of soft materials are relatively low, how the intrinsic fracture energy cooperates with mechanical dissipation in the process zone to give soft materials high fracture toughness is not well understood. In addition, it is still challenging to predict fracture energies and crack-tip strain fields of soft tough materials. Here, we report a scaling theory that accounts for synergistic effects of intrinsic fracture energies and dissipation on the toughening of soft materials. We then develop a coupled cohesive-zone and Mullins-effect model capable of quantitatively predicting fracture energies of soft tough materials and strain fields around crack tips in soft materials under large deformation. The theory and model are quantitatively validated by experiments on fracture of soft tough materials under large deformations. We further provide a general toughening diagram that can guide the design of new soft tough materials. Comment: 22 pages, 5 figure

arXiv.org e-Print Archive

DSpace@MIT

High-performance, flexible thermoelectric generator based on bulk materials

Authors: Chen Gang; Lin Shaoting; Xu Qian; Zhang Lenan
Publication venue: Elsevier BV
Publication date: 22/02/2022

the Centers for Mechanical Engineering Research and Education at MIT and SUSTec

DSpace@MIT

Semi-supervised Pathological Image Segmentation via Cross Distillation of Multiple Attentions

Authors: Liao Xin; Wang Guotai; Zhang Shaoting; Zhong Lanfeng
Publication date: 30/05/2023

Segmentation of pathological images is a crucial step for accurate cancer diagnosis. However, acquiring dense annotations of such images for training is labor-intensive and time-consuming. To address this issue, Semi-Supervised Learning (SSL) has the potential for reducing the annotation cost, but it is challenged by a large number of unlabeled training images. In this paper, we propose a novel SSL method based on Cross Distillation of Multiple Attentions (CDMA) to effectively leverage unlabeled images. Firstly, we propose a Multi-attention Tri-branch Network (MTNet) that consists of an encoder and a three-branch decoder, with each branch using a different attention mechanism that calibrates features in different aspects to generate diverse outputs. Secondly, we introduce Cross Decoder Knowledge Distillation (CDKD) between the three decoder branches, allowing them to learn from each other's soft labels to mitigate the negative impact of incorrect pseudo labels in training. Additionally, uncertainty minimization is applied to the average prediction of the three branches, which further regularizes predictions on unlabeled images and encourages inter-branch consistency. Our proposed CDMA was compared with eight state-of-the-art SSL methods on the public DigestPath dataset, and the experimental results showed that our method outperforms the other approaches under different annotation ratios. The code is available at https://github.com/HiLab-git/CDMA. Comment: Provisionally accepted by MICCAI 202

arXiv.org e-Print Archive

Pathology-and-genomics Multimodal Transformer for Survival Outcome Prediction

Authors: Ding Kexin; Metaxas Dimitris N.; Zhang Shaoting; Zhou Mu
Publication date: 21/07/2023

Survival outcome assessment is challenging and inherently associated with multiple clinical factors (e.g., imaging and genomics biomarkers) in cancer. Enabling multimodal analytics promises to reveal novel predictive patterns of patient outcomes. In this study, we propose a multimodal transformer (PathOmics) integrating pathology and genomics insights into colon-related cancer survival prediction. We emphasize unsupervised pretraining to capture the intrinsic interaction between tissue microenvironments in gigapixel whole slide images (WSIs) and a wide range of genomics data (e.g., mRNA-sequence, copy number variant, and methylation). After the multimodal knowledge aggregation in pretraining, our task-specific model finetuning can expand the scope of data utility to both multi- and single-modal data (e.g., image- or genomics-only). We evaluate our approach on both TCGA colon and rectum cancer cohorts, showing that the proposed approach is competitive and outperforms state-of-the-art studies. Finally, our approach is well suited to using a limited number of finetuning samples towards data-efficient analytics for survival outcome prediction. The code is available at https://github.com/Cassie07/PathOmics. Comment: Accepted to MICCAI2023 (Top14%

arXiv.org e-Print Archive